Poster: Nemo_bis Date: Dec 30, 2012 4:31am
Forum: opensource Subject: Re: Derivation failed from PDF: ppmtobmp failed with exit code: 1

Sometimes it fails with the same error, but preceded by this:

=== Heuristic Resolution Analysis ===
number of pages in PDF: 44
median page size: 63.014 x 83.611
No embedded images to analyze; setting resolution to default value of 300

Updating meta.xml with ppi = "300"

As with , changing ppi value doesn't help.
pdftk compress and pdfsizeopt didn't help (the size stayed more or less the same), but converting with pdf2ps and then back with ps2pdf increased the size by 50% and the derive worked (except for an unrelated error in documenttodjvu): it set the dpi to a much lower 72.
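
For anyone wanting to script the pdf2ps/ps2pdf round trip over many files, here is a rough Python sketch. It assumes the Ghostscript wrapper scripts pdf2ps and ps2pdf are on the PATH and passes them no options, since none were used above; the function names and the "-roundtrip" output name are just made up for illustration.

```python
import subprocess
from pathlib import Path


def roundtrip_commands(pdf_path):
    """Build the two commands for a PDF -> PostScript -> PDF round trip.

    The intermediate .ps name and the "-roundtrip" suffix are
    hypothetical choices, not anything the derive process expects.
    """
    src = Path(pdf_path)
    ps = src.with_suffix(".ps")
    out = src.with_name(src.stem + "-roundtrip.pdf")
    return [
        ["pdf2ps", str(src), str(ps)],   # PDF to PostScript
        ["ps2pdf", str(ps), str(out)],   # PostScript back to PDF
    ]


def roundtrip(pdf_path):
    """Run both conversions, stopping at the first one that fails."""
    for cmd in roundtrip_commands(pdf_path):
        subprocess.run(cmd, check=True)
```

Calling roundtrip("book.pdf") would leave book.ps and book-roundtrip.pdf next to the original; whether the result derives correctly still has to be checked item by item.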

Should all those PDFs be reuploaded like that or is setting the dpi to 72 in some way enough? They're 650 items for a total of about 9 GiB so I'd like to avoid reupload if possible...

Poster: Nemo_bis Date: Jan 6, 2013 1:02am
Forum: opensource Subject: Re: Derivation failed from PDF: ppmtobmp failed with exit code: 1

So, thanks to the very helpful IA staff, I've found out that it was enough to force the ppi value. It was my fault for using the wrong field: you have to use "fixed-ppi" (example). We've set it to 75; in one case I even reduced it further, to 36.

The problem which causes that huge size ("median page size: 63.014 x 83.611") is that those PDFs use points instead of pixels, and those are not recognized well. ProcessJP2 usually takes only a few minutes; here it fails because otherwise it would produce image files dozens of times bigger than they should be.
I guess one should avoid uploading such PDFs, or, if they're taken from elsewhere, a "fixed-ppi" value should be set from the start; in most cases, including the one mentioned above, 300 is low enough.
pdfinfo shows the pts/points in "Page size", like this:

Creator: pdftk 1.40 -
Producer: itext-paulo-155 (
CreationDate: Tue Nov 25 09:54:20 2008
ModDate: Tue Nov 25 09:54:20 2008
Tagged: no
Pages: 84
Encrypted: no
Page size: 4537 x 6020 pts
File size: 22693186 bytes
Optimized: no
PDF version: 1.4
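
To spot such files before uploading, the "Page size" line can be turned into pixel dimensions for a given ppi. The sketch below is only a rough illustration: it assumes the usual 72 points per inch, and the function names are made up. With that assumption, 4537 x 6020 pts comes out at about 63.014 x 83.611 inches, the same figures as the median page size in the log quoted earlier.

```python
import re

POINTS_PER_INCH = 72  # assumption: standard PDF unit, 1 pt = 1/72 inch


def parse_page_size(pdfinfo_text):
    """Return (width_pts, height_pts) from pdfinfo output, or None."""
    match = re.search(
        r"^Page size:\s*([\d.]+)\s*x\s*([\d.]+)\s*pts",
        pdfinfo_text,
        re.MULTILINE,
    )
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def pixels_at(points, ppi):
    """Pixel length of a page side rendered at the given ppi."""
    return round(points * ppi / POINTS_PER_INCH)


sample = "Pages: 84\nPage size: 4537 x 6020 pts\nFile size: 22693186 bytes\n"
width_pts, height_pts = parse_page_size(sample)

# Compare the default of 300 with the forced value of 75.
for ppi in (300, 75):
    print(ppi, pixels_at(width_pts, ppi), "x", pixels_at(height_pts, ppi))
```

At 300 ppi this page would be 18904 x 25083 pixels, against 4726 x 6271 at 75.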

I also found some with "GPL Ghostscript 906 (ps2write)" and a commercial "professional" product I can't remember, but it doesn't seem to be a specific producer's "fault".